Study of Classification Algorithms using Moment Analysis

Authors

  • Amit Dhurandhar
  • Alin Dobra
Abstract

In this short paper we briefly discuss a recently introduced moment-based method for studying the behavior of classification algorithms and model validation techniques at finite sample sizes. The method involves accurate and efficient computation of the moments of the generalization error, where the moments are taken over the space of all possible datasets of size N drawn from an underlying distribution. A classification algorithm trained on each of these datasets induces a space of classifiers (i.e., an empirical hypothesis space), and the moments can equivalently be computed over this space. In our previous work we also related the moments of the generalization error to the moments of the hold-out error, cross-validation error, and leave-one-out error, so these model validation techniques can also be studied accurately with our method. The primary goal of this paper is to familiarize machine learning researchers with this newly proposed methodology and to discuss its implications for important problems such as classification model selection.
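To make the quantity in the abstract concrete, the following is a minimal sketch of what "moments of the generalization error over all datasets of size N" means. It is a brute-force Monte Carlo approximation, not the efficient computation the paper refers to. The joint distribution `JOINT`, the majority-vote classifier, and all function names are hypothetical choices made only for illustration.

```python
import random

# Hypothetical joint distribution P(x, y) over one binary feature x and a
# binary label y. These numbers are an assumption, not taken from the paper.
JOINT = {(0, 0): 0.35, (0, 1): 0.15, (1, 0): 0.10, (1, 1): 0.40}


def sample_dataset(n, rng):
    """Draw n i.i.d. (x, y) pairs from JOINT."""
    cells = list(JOINT)
    weights = [JOINT[c] for c in cells]
    return rng.choices(cells, weights=weights, k=n)


def train_majority(dataset):
    """Return {x: predicted y}, predicting the majority label seen for each x."""
    counts = {0: [0, 0], 1: [0, 0]}
    for x, y in dataset:
        counts[x][y] += 1
    # Ties (including feature values never seen) are broken in favour of label 0.
    return {x: int(c[1] > c[0]) for x, c in counts.items()}


def generalization_error(classifier):
    """Exact misclassification probability of a classifier under JOINT."""
    return sum(p for (x, y), p in JOINT.items() if classifier[x] != y)


def estimate_moments(n, trials=20000, seed=0):
    """Monte Carlo estimate of the mean and variance of the generalization
    error over random training sets of size n."""
    rng = random.Random(seed)
    errors = [
        generalization_error(train_majority(sample_dataset(n, rng)))
        for _ in range(trials)
    ]
    mean = sum(errors) / trials
    second_moment = sum(e * e for e in errors) / trials
    return mean, second_moment - mean * mean


if __name__ == "__main__":
    for n in (5, 20, 200):
        mean, var = estimate_moments(n)
        print(f"N={n}: mean error ~ {mean:.4f}, variance ~ {var:.6f}")
```

Each sampled dataset yields one trained classifier, so averaging over datasets is the same as averaging over the induced space of classifiers, which is the equivalence the abstract mentions. In this toy setting the error of any classifier is bounded below by the Bayes error of `JOINT` (0.15 + 0.10), and the estimated mean should approach that bound as N grows.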

Similar resources

Analysis of Pre-processing and Post-processing Methods and Using Data Mining to Diagnose Heart Diseases

Today, a great deal of data is generated in the medical field. Acquiring useful knowledge from this raw data requires data processing and detection of meaningful patterns and this objective can be achieved through data mining. Using data mining to diagnose and prognose heart diseases has become one of the areas of interest for researchers in recent years. In this study, the literature on the ap...

Comparison of Machine Learning Algorithms for Broad Leaf Species Classification Using UAV-RGB Images

Knowing the tree species composition of forests provides valuable information for studying a forest's economic value, fire risk assessment, biodiversity monitoring, and wildlife habitat improvement. Fieldwork is often time-consuming and labor-intensive, free satellite data are available only at coarse resolution, and the use of manned aircraft is relatively costly. Recently, unmanned aeria...

PERFORMANCE-BASED OPTIMIZATION AND SEISMIC COLLAPSE SAFETY ASSESSMENT OF STEEL MOMENT FRAMES

The main aim of the present study is to optimize steel moment frames in the framework of performance-based design and to assess the seismic collapse capacity of the optimal structures. In the first phase of this study, four well-known metaheuristic algorithms are employed to achieve the optimization task. In the second phase, the seismic collapse safety of the obtained optimal designs is evalua...

Predicting The Type of Malaria Using Classification and Regression Decision Trees

Maryam Ashoori (School of Technical and Engineering, Higher Educational Complex of Saravan, Saravan, Iran), Fatemeh Hamzavi (School of Agriculture, Higher Educational Complex of Saravan, Saravan, Iran). Background: Malaria is an infectious disease infecting 200 - 300 million people annually. Environme...

Identification of Fraud in Banking Data and Financial Institutions Using Classification Algorithms

In recent years, due to the expansion of financial institutions, as well as the popularity of the World Wide Web and e-commerce, a significant increase in the volume of financial transactions has been observed. In addition to the increase in turnover, a huge increase in the number of frauds arising from abnormal user behavior is resulting in billions of dollars in losses worldwide. T...

Diagnosis of Diabetes Using an Intelligent Approach Based on Bi-Level Dimensionality Reduction and Classification Algorithms

Objective: Diabetes is one of the most common metabolic diseases. Early diagnosis of diabetes and treatment of hyperglycemia and related metabolic abnormalities are of vital importance. Diagnosing diabetes through proper interpretation of diabetes data is an important classification problem. Classification systems help clinicians to predict the risk factors that cause diabetes or pre...


Publication date: 2008